Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity


Abstract

We consider a distributionally robust partially observable Markov decision process (DR-POMDP), where the distribution of the transition-observation probabilities is unknown at the beginning of each decision period, but their realizations can be inferred using side information at the end of each period after an action has been taken. We build an ambiguity set of the joint distribution using bounded moments via conic constraints and seek an optimal policy to maximize the worst-case (minimum) reward for any distribution in the set. We show that the value function of the DR-POMDP is piecewise linear convex with respect to the belief state and propose a heuristic search value iteration method for obtaining lower and upper bounds of the value function. We conduct numerical studies and demonstrate the computational performance of our approach by testing instances of a dynamic epidemic control problem. Our results show that DR-POMDP can produce more robust policies under misspecified distributions of transition-observation probabilities as compared to POMDP, but has less costly solutions than robust POMDP. The DR-POMDP policies are also insensitive to varying the parameter in the ambiguity set and to noise added to the true transition-observation probability values obtained at the end of each decision period.
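To make two of the objects in the abstract concrete, here is a minimal Python sketch of a belief update driven by a joint transition-observation matrix, and of evaluating a piecewise linear convex value function as the maximum over a finite set of linear pieces. This is not the paper's algorithm: the function names `belief_update` and `pwlc_value`, the nested-list layout `joint[s][s2][z]`, and the idea of representing the value function by a list of vectors `alphas` are illustrative assumptions, and the worst-case optimization over the ambiguity set is omitted entirely.

```python
from typing import List, Sequence


def belief_update(
    belief: Sequence[float],
    joint: Sequence[Sequence[Sequence[float]]],
    obs: int,
) -> List[float]:
    """Bayes update of a belief state for one fixed action.

    Assumed layout (hypothetical): joint[s][s2][z] = P(s2, z | s, a),
    the joint transition-observation probability for the chosen action a.
    """
    n_next = len(joint[0])
    # Unnormalized posterior weight of each next state s2 given observation obs.
    unnorm = [
        sum(belief[s] * joint[s][s2][obs] for s in range(len(belief)))
        for s2 in range(n_next)
    ]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnorm]


def pwlc_value(belief: Sequence[float], alphas: Sequence[Sequence[float]]) -> float:
    """Evaluate a piecewise linear convex function V(b) = max_alpha <alpha, b>."""
    return max(sum(a * b for a, b in zip(alpha, belief)) for alpha in alphas)


# Toy two-state, two-observation example (all numbers are made up).
joint = [
    [[0.7, 0.1], [0.1, 0.1]],  # from state 0
    [[0.1, 0.1], [0.2, 0.6]],  # from state 1
]
alphas = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]
new_belief = belief_update([0.5, 0.5], joint, obs=0)
value = pwlc_value([0.5, 0.5], alphas)
```

Because the value is a pointwise maximum of linear functions of the belief, it is convex by construction; this is the structure that bounding methods based on sets of linear pieces rely on.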


Similar resources

Robust partially observable Markov decision process

We seek to find the robust policy that maximizes the expected cumulative reward for the worst case when a partially observable Markov decision process (POMDP) has uncertain parameters whose values are only known to be in a given region. We prove that the robust value function, which represents the expected cumulative reward that can be obtained with the robust policy, is convex with respect to ...


The Infinite Partially Observable Markov Decision Process

The Partially Observable Markov Decision Process (POMDP) framework has proven useful in planning domains where agents must balance actions that provide knowledge and actions that provide reward. Unfortunately, most POMDPs are complex structures with a large number of parameters. In many real-world problems, both the structure and the parameters are difficult to specify from domain knowledge alo...


Text Understanding With Partially Observable Markov Decision Process

The process of understanding the meaning of a written passage inherently involves dynamic manipulation and composition of ideas. Starting from this observation this thesis proposes an artificial system for text understanding in which the semantic space containing the possible meanings of the analyzed text is selectively explored by a partially observable Markov decision process trained to effec...


Distributionally Robust Markov Decision Processes

We consider Markov decision processes where the values of the parameters are uncertain. This uncertainty is described by a sequence of nested sets (that is, each set contains the previous one), each of which corresponds to a probabilistic guarantee for a different confidence level so that a set of admissible probability distributions of the unknown parameters is specified. This formulation mode...



Journal

Journal title: SIAM Journal on Optimization

Year: 2021

ISSN: 1095-7189, 1052-6234

DOI: https://doi.org/10.1137/19m1268410